View Post

Poster: David Fifield Date: Jun 6, 2018 3:44pm
Forum: software Subject: dosbox emulation fails when zip filename contains a comma (gets double-escaped to %252C)

Emulation of this item fails:
https://archive.org/details/TEST_MicroCom_173a_PC-Chart_1_of_2
  • Game Metadata✔
  • Game File List✔
  • Emulator Metadata✔
  • Game File (1 of 1)✘


The reason seems to be that the zip filename contains a comma character, and something is applying two layers of URL quoting to it, when there should be only one. The double-quoting doesn't happen to other characters like slash: comma becomes %252C but slash becomes %2F. The cors_get.php endpoint doesn't understand the extra quoting and returns status code 417. The error doesn't happen in other, similar items whose zip filename does not contain a comma, like https://archive.org/details/MicroCom_113_Image-3D.

Looking in the browser console, the erroneous request is:
GET https://ia801507.us.archive.org/cors_get.php?path=%2F2%2Fitems%2FTEST_MicroCom_173a_PC-Chart_1_of_2%2FMicroCom_173a_PC-Chart%252C_1_of_2.zip 417 (Expectation Failed)

Changing the %252C to %2C makes it work:
GET https://ia801507.us.archive.org/cors_get.php?path=%2F2%2Fitems%2FTEST_MicroCom_173a_PC-Chart_1_of_2%2FMicroCom_173a_PC-Chart%2C_1_of_2.zip
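The two path values above can be reproduced with a short Python sketch. It is only a guess at the mechanism: it assumes the filename is percent-encoded on its own first and the whole path is then encoded a second time. The names ITEM_DIR and FILENAME are made up for illustration, and nothing here is taken from the actual archive.org code.

```python
from urllib.parse import quote

# Hypothetical names; the values are copied from the requests quoted above.
ITEM_DIR = "/2/items/TEST_MicroCom_173a_PC-Chart_1_of_2/"
FILENAME = "MicroCom_173a_PC-Chart,_1_of_2.zip"

# One layer of encoding over the whole path: comma -> %2C, slash -> %2F.
once = quote(ITEM_DIR + FILENAME, safe="")

# Assumed bug: the filename is encoded first, then the whole path is encoded
# again. The slashes are only encoded once, but the comma's "%" is encoded a
# second time, so the comma ends up as %252C.
twice = quote(ITEM_DIR + quote(FILENAME, safe=""), safe="")

print(once)
print(twice)
```

If the assumption holds, it would also explain why slash is unaffected: the slashes are not part of the filename, so they only pass through the second encoding step.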


I suppose I can work around this problem by renaming my zip files.

This post was modified by David Fifield on 2018-06-06 17:05:54

Edit: Changed MicroCom_173a_PC-Chart_1_of_2 to TEST_MicroCom_173a_PC-Chart_1_of_2 so I can change the filenames in the original.

This post was modified by David Fifield on 2018-06-06 22:44:42

Reply

Poster: David Fifield Date: Jun 6, 2018 3:48pm
Forum: software Subject: Re: dosbox emulation fails when zip filename contains a comma (gets double-escaped to %252C)

There's a similar but opposite error when the zip filename contains a percent character. In this case, the problem is not too much escaping, but too little. A filename containing the three-byte sequence %2f should become %252f when URL-encoded, but it is apparently being decoded and then re-encoded, in a no-op that results in %2f.

This item exhibits the problem:
https://archive.org/details/TEST_MicroCom_185a_DOS_Ref._Man._1_2
The erroneous request is:
Failed to load https://cors.archive.org/cors/TEST_MicroCom_185a_DOS_Ref._Man._1_2/MicroCom_185a_DOS_Ref._Man.,_1%2f2.zip: Redirect from 'https://cors.archive.org/cors/TEST_MicroCom_185a_DOS_Ref._Man._1_2/MicroCom_185a_DOS_Ref._Man.,_1%2f2.zip' to 'https://archive.org/about/404.php' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://archive.org' is therefore not allowed access.
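To illustrate why the literal percent sign needs its own layer of encoding, here is a small Python sketch. It assumes the server percent-decodes the last path component once, which is my guess and not something I have confirmed; FILENAME is just the name of my zip file, and the comma is left unencoded to match the URL above.

```python
from urllib.parse import quote, unquote

# The actual zip filename, containing the literal three bytes "%2f".
FILENAME = "MicroCom_185a_DOS_Ref._Man.,_1%2f2.zip"

# Correct encoding for a URL path component: the literal "%" becomes "%25".
# The comma is kept as-is here only to match the URL in the error message.
encoded = quote(FILENAME, safe=",")

# Decoding the correctly encoded form gives back the real filename.
round_trip = unquote(encoded)

# Assumed server behavior: if the raw filename is sent unencoded, one round
# of decoding turns "%2f" into "/", which names a different (nonexistent) file.
misread = unquote(FILENAME)

print(encoded)
print(round_trip == FILENAME)
print(misread)
```

Under that assumption, the unencoded request points at a file called "2.zip" inside a directory, which would be consistent with the redirect to the 404 page.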

Correcting the %2f to %252f allows it to advance to the cors_get.php stage; then it fails with the same double-encoding error as in the parent comment. It leads to
GET https://ia801506.us.archive.org/cors_get.php?path=%2F23%2Fitems%2FTEST_MicroCom_185a_DOS_Ref._Man._1_2%2FMicroCom_185a_DOS_Ref._Man.%252C_1%25252f2.zip 417 (Expectation Failed)

It should rather have %2C and %252f than %252C and %25252f:
GET https://ia801506.us.archive.org/cors_get.php?path=%2F23%2Fitems%2FTEST_MicroCom_185a_DOS_Ref._Man._1_2%2FMicroCom_185a_DOS_Ref._Man.%2C_1%252f2.zip


Edit: Changed MicroCom_185a_DOS_Ref._Man._1_2 to TEST_MicroCom_185a_DOS_Ref._Man._1_2 so I can change the filenames in the first one.

This post was modified by David Fifield on 2018-06-06 22:48:46